Clustering Polysemic Subcategorization Frame Distributions Semantically
Abstract
Previous research has demonstrated the utility of clustering in inducing semantic verb classes from undisambiguated corpus data. We describe a new approach which involves clustering subcategorization frame (SCF) distributions using the Information Bottleneck and nearest neighbour methods. In contrast to previous work, we particularly focus on clustering polysemic verbs. A novel evaluation scheme is proposed which accounts for the effect of polysemy on the clusters, offering us a good insight into the potential and limitations of semantically classifying undisambiguated SCF data.
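As a rough illustration of the nearest neighbour idea mentioned in the abstract, the sketch below links each verb to the verb whose SCF distribution is closest and treats the resulting connected groups as clusters. The abstract does not say which distance measure or linkage is used, so the Jensen-Shannon divergence, the function names, and the toy frame probabilities here are all assumptions made for illustration, not the authors' method or data.

```python
import math

def kl(p, q):
    # Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0.
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    # Jensen-Shannon divergence (assumed measure, not taken from the abstract).
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def nearest_neighbours(dists):
    # Map each verb to the other verb with the most similar SCF distribution.
    return {
        v: min((u for u in dists if u != v),
               key=lambda u: js_divergence(p, dists[u]))
        for v, p in dists.items()
    }

def nn_clusters(dists):
    # Link every verb to its nearest neighbour; connected groups form clusters.
    parent = {v: v for v in dists}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for v, u in nearest_neighbours(dists).items():
        parent[find(v)] = find(u)

    groups = {}
    for v in dists:
        groups.setdefault(find(v), []).append(v)
    return sorted(sorted(g) for g in groups.values())

# Hypothetical SCF distributions over three made-up frames:
# [intransitive, NP, NP-PP]. The numbers are invented for the example.
toy_scf = {
    "run":  [0.60, 0.30, 0.10],
    "walk": [0.65, 0.25, 0.10],
    "give": [0.05, 0.35, 0.60],
    "send": [0.05, 0.30, 0.65],
}

clusters = nn_clusters(toy_scf)
print(clusters)
```

With a polysemic verb, a single distribution mixes the frames of several senses, so its nearest neighbour may reflect only the predominant sense; this is the kind of effect the evaluation scheme described in the abstract is meant to account for.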
Similar resources
Inducing a Semantically Annotated Lexicon via EM-Based Clustering
We present a technique for automatic induction of slot annotations for subcategorization frames, based on induction of hidden classes in the EM framework of statistical estimation. The models are empirically evaluated by a general decision test. Induction of slot labeling for subcategorization frames is accomplished by a further application of EM, and applied experimentally on frame observatio...
Clustering Verbs Semantically According to their Alternation Behaviour
Verbs were clustered semantically on the basis of their alternation behaviour, as characterised by their syntactic subcategorisation frames extracted from maximum probability parses of a robust statistical parser, and completed by assigning WordNet classes as selectional preferences to the frame arguments. The clustering was achieved (a) iteratively by measuring the relative entropy between the...
Subcategorization acquisition
Manual development of large subcategorised lexicons has proved difficult because predicates change behaviour between sublanguages, domains and over time. Yet access to a comprehensive subcategorization lexicon is vital for successful parsing capable of recovering predicate-argument relations, and probabilistic parsers would greatly benefit from accurate information concerning the relative likel...
Detecting Dependencies Between Semantic Verb Subclasses And Subcategorization Frames In Text Corpora
There is a widespread belief among linguists that a predicate's subcategorization frames are largely determined by its lexical-semantic properties [23, 11, 12]. Consider the domain of movement verbs. Following Talmy [23], these can be semantically classified with reference to the meaning components: MOTION, MANNER, CAUSATION, THEME (MOVING ENTITY), PATH and REFERENCE LOCATIONS (GOAL, SOURCE). L...
A Corpus-based Conceptual Clustering Method for Verb Frames and Ontology Acquisition
We describe in this paper the ML system, ASIUM, which learns subcategorization frames of verbs and ontologies from syntactic parsing of technical texts in natural language. The restrictions of selection in the subcategorization frames are filled by the concepts of the ontology. Applications requiring subcategorization frames and ontologies are crucial and numerous. The most direct applications ...
Publication date: 2003